

Search for: All records

Creators/Authors contains: "Wei, Dong"


  1. Given m users (voters), each of whom casts a ballot expressing her preference for a single item (candidate) among n items (candidates), the preference aggregation problem returns the k items (candidates) that receive the k highest numbers of preferences (votes). Our work studies this problem under complex fairness constraints that must be satisfied through proportionate representation of the different values of the protected group attribute(s) in the top-k results. Precisely, we study the margin finding problem under single ballot substitutions, where a single substitution removes a vote from candidate i and assigns it to candidate j, and the goal is to minimize the number of single ballot substitutions needed to guarantee that the top-k results satisfy the fairness constraints. We study several variants of this problem, depending on how the top-k fairness constraints are defined: (i) MFBinaryS and MFMultiS apply when fairness (proportionate representation) is defined over a single protected attribute that is binary or multivalued, respectively; (ii) MF-Multi2 applies when top-k fairness is defined over two different protected attributes; (iii) MFMulti3+ addresses the margin finding problem with three or more protected attributes. We study these problems theoretically and present a suite of algorithms with provable guarantees. We conduct rigorous large-scale experiments on multiple real-world datasets, appropriately adapting multiple state-of-the-art solutions as baselines, to demonstrate the effectiveness and scalability of our proposed methods.
  2. Supramolecular nanocages with inner cavities have attracted increasing attention due to their fascinating molecular aesthetics and vast number of potential applications. Even though a wide array of discrete supramolecular cages with precisely designed sizes and shapes have been established, the controlled assembly of higher-order supramolecular frameworks from discrete molecular entities still represents a formidable challenge. In this work, a novel metallo-organic cage [Zn12L4] was assembled based on a triphenylene-cored hexapod terpyridine ligand. Synchrotron X-ray analysis revealed a pair of enantiomeric cages in the crystal, with flexible ligands twisted clockwise or anticlockwise due to steric hindrance in the structure. Interestingly, due to the strong π–π intermolecular interaction between triphenylene units, a controlled hierarchical packing of sphere-like cages was established in the crystal, with a sparse packing mode and huge channels of around 3.6 nm in diameter. This research sheds light on the design of strong π–π interactions in supramolecular hierarchical packing and materials science.
  3. Abstract

    Although each individual's chromosomes carry a large number of structural variations, accurate methods for identifying clinically pathogenic variants are lacking. Here, we propose SVPath, a machine learning-based method to predict the pathogenicity of deletion, insertion and duplication structural variations that occur in exons. We constructed three types of annotation features for each structural variation event in the ClinVar database. First, we treated complex structural variations as multiple consecutive single nucleotide polymorphism events, and annotated them with correlation scores based on single nucleic acid substitutions, such as the impact on protein function. Second, we determined which genes the variation occurred in, and constructed gene-based annotation features for each structural variation. Third, we calculated related features based on the transcriptome, such as histone signal and the overlap ratio of the variation with genomic element definitions. Finally, we employed a gradient boosting decision tree machine learning method, and used the deletions, insertions and duplications in the ClinVar database, each clearly labeled as pathogenic or benign, to train the structural variation pathogenicity prediction model SVPath. Experimental results show that SVPath achieves excellent predictive performance and outperforms existing state-of-the-art tools. SVPath is very promising for evaluating the clinical pathogenicity of structural variants. It can be used in clinical research to predict the clinical significance of new structural variations and of those with unknown pathogenicity, so as to explore the relationship between diseases and structural variations computationally.

  4. Peer groups leverage the presence of knowledgeable individuals in order to increase the knowledge level of other participants. The 'smart' formation of peer groups can thus play a crucial role in educational settings, including online social networks and learning platforms. Indeed, the targeted group formation problem, where the objective is to maximize a measure of aggregate knowledge, has received considerable attention in recent literature. In this paper we initiate the study of a dynamic variant of the problem that, unlike previous works, allows group composition to change over time while still aiming to maximize the aggregate knowledge level. The problem is studied in a principled way, using a realistic learning gain function and two different interaction modes among the group members. On the algorithmic side, we present DyGroups, a generic algorithmic framework that is greedy in nature and highly scalable. We present non-trivial proofs to demonstrate theoretical guarantees for DyGroups in a special case. We also present real peer learning experiments with humans, and perform synthetic data experiments to demonstrate the effectiveness of our proposed solutions by comparing against multiple appropriately selected baseline algorithms.
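
The margin finding problem described in the first abstract above can be illustrated with a small brute-force sketch. This is not the authors' algorithm: it is an exhaustive breadth-first search over single ballot substitutions, usable only on tiny instances. It assumes one binary protected attribute, a fairness constraint of the form "at least `min_protected` protected candidates in the top-k", and ties broken in favor of the lower candidate index. The function names and the constraint form are hypothetical choices made for illustration.

```python
from collections import deque


def top_k(votes, k):
    """Return the k candidates with the most votes.

    Assumption: ties are broken in favor of the lower candidate index.
    """
    order = sorted(range(len(votes)), key=lambda c: (-votes[c], c))
    return order[:k]


def is_fair(votes, k, protected, min_protected):
    """Assumed constraint: at least min_protected protected candidates in the top-k."""
    return sum(1 for c in top_k(votes, k) if c in protected) >= min_protected


def margin(votes, k, protected, min_protected):
    """Minimum number of single ballot substitutions that makes the top-k fair.

    A substitution removes one vote from candidate i and gives it to candidate j.
    Breadth-first search visits vote vectors in order of substitution count, so
    the first fair vector found is optimal. Returns None if no reachable vote
    vector is fair.
    """
    start = tuple(votes)
    if is_fair(start, k, protected, min_protected):
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        for i in range(len(state)):
            if state[i] == 0:
                continue  # candidate i has no vote to give up
            for j in range(len(state)):
                if i == j:
                    continue
                nxt = list(state)
                nxt[i] -= 1
                nxt[j] += 1
                nxt = tuple(nxt)
                if nxt in seen:
                    continue
                if is_fair(nxt, k, protected, min_protected):
                    return dist + 1
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Four candidates; candidates 2 and 3 belong to the protected group.
print(margin([5, 4, 3, 1], k=2, protected={2, 3}, min_protected=1))
```

In the example, moving one vote from candidate 1 to candidate 2 places candidate 2 in the top 2, so a single substitution suffices. The search space grows quickly with the number of votes and candidates, which is why this sketch is only meant to make the problem definition concrete.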